SDSS Dataset and SkyServer Workloads
Abstract
From what I see in Figure 1, of the 1.3 million files in DR4, only some 350K files contain any objects. It is also interesting that the objects are not uniformly distributed over the data files: the number of files with a low, medium, or high number of objects per file follows a normal-like distribution.

Comment [IR1]: Are the 320M objects I have from DR4 representative of all the objects in DR4? If not, can you make a database for me that contains all the object coordinates (just RA and DEC is enough) from DR5, and give me a link to download it?
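As a rough illustration of the per-file tally described in the abstract, the following Python sketch counts objects per file, separates empty from non-empty files, and bins the per-file counts into a histogram. The file identifiers, the bin width, and the function names are all hypothetical and the data is a toy sample; nothing here reflects the actual DR4 file layout or the method behind Figure 1.

```python
from collections import Counter


def objects_per_file(object_file_ids):
    """Count how many objects fall in each file (file IDs are hypothetical)."""
    return Counter(object_file_ids)


def summarize(counts, total_files, bin_width):
    """Return (non-empty files, empty files, histogram of objects per file)."""
    non_empty = len(counts)
    empty = total_files - non_empty
    # Histogram key is the lower edge of the bin; value is the number of files
    # whose object count falls in [edge, edge + bin_width).
    histogram = Counter((n // bin_width) * bin_width for n in counts.values())
    return non_empty, empty, dict(sorted(histogram.items()))


# Toy data only: one entry per object, naming the file that holds it.
sample = ["f1", "f1", "f2", "f3", "f3", "f3", "f3", "f3"]
counts = objects_per_file(sample)
non_empty, empty, hist = summarize(counts, total_files=10, bin_width=2)
print(non_empty, empty, hist)
```

With real data, the histogram would show directly whether the objects-per-file counts cluster around a central value, as a normal-like shape would suggest, or are skewed.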
Similar resources
Text Mining Applied to SQL Queries: A Case Study for the SDSS SkyServer
SkyServer, the portal for the Sloan Digital Sky Survey (SDSS) catalog, provides data access tools for astronomers and scientific education. One of the interfaces allows users to enter ad hoc SQL statements to query the catalog, and has logged over 280 million queries since 2001. This paper describes text mining techniques and preliminary results on mining the logs of the SQL queries submitted t...
Data Mining the SDSS SkyServer Database
An earlier paper described the Sloan Digital Sky Survey’s (SDSS) data management needs [Szalay1] by defining twenty database queries and twelve data visualization tasks that a good data management system should support. We built a database and interfaces to support both the query load and also a website for ad-hoc access. This paper reports on the database design, describes the data loading pip...
Identifying User Interests within the Data Space - a Case Study with SkyServer
Many scientific databases nowadays are publicly available for querying and advanced data analytics. One prominent example is the Sloan Digital Sky Survey (SDSS)—SkyServer, which offers data to astronomers, scientists, and the general public. For such data it is important to understand the public focus, and trending research directions on the subject described by the database, i.e., astronomy in...
Extending the SDSS Batch Query System to the National Virtual Observatory Grid
The Sloan Digital Sky Survey science database is approaching 2TB. While the vast majority of queries normally execute in seconds or minutes, this interactive execution time can be disproportionately increased by a small fraction of queries that take hours or days to run; either because they require non-index scans of the largest tables or because they request very large result sets. In response...
An Analysis of Usage Locality for Data-Centric Web Services (TR-2005-866)
The growing popularity of XML Web Services is resulting in a significant increase in the proportion of Internet traffic that involves requests to and responses from Web Services. Unfortunately, web service responses, because they are generated dynamically, are considered “uncacheable” by traditional caching infrastructures. One way of remedying this situation is by developing alternative cachin...